Abstract - In this paper, we present a new data mining
algorithm for mining the moving patterns of multiple
mobile nodes in an ad hoc computing environment, and we
use the mining results to develop multichannel data
allocation schemes that improve the overall performance
of an ad hoc system. First, we propose an algorithm to
capture the frequent moving patterns of multichannel
users from a set of log data in a mobile (ad hoc)
environment. The proposed algorithm is enhanced with a
data mining capability and is able to discover new moving
patterns efficiently without compromising the quality of
the results obtained. In light of the mining results of
node moving patterns and the properties of data objects,
we develop multichannel data allocation schemes that
utilize the knowledge of mobile node moving patterns for
proper allocation of both personal and shared data. By
employing these data allocation schemes, the occurrences
of costly remote accesses can be minimized and the
performance of a mobile ad hoc system is thus improved.
For personal data allocation (individual channel), two
data allocation schemes, which explore different levels
of mining results, are devised: one utilizes the set
level of the moving patterns of mobile nodes and the
other utilizes the path level of the moving patterns of
mobile nodes (i.e., OSPF routing). The performance of
these data allocation schemes is comparatively analyzed.
Our simulation results show that the knowledge obtained
from the moving patterns of mobile nodes is very
important in devising effective multichannel data
allocation schemes, which can lead to significant
performance improvement in a mobile ad hoc system
(mobile computing).

Keywords- Data mining, mobile computing, user moving
patterns, data allocation scheme, mobile database, ad hoc
networks.

I. Introduction 

Due to recent technology advances, an increasing number of 
users are accessing various information systems via wireless 
communication. Information systems such as stock trading,
banking, and wireless conferencing are being provided
by information service and application providers [13], [19],
and mobile users are able to access such information via 
wireless communication from anywhere at any time [4], [9], 
[29]. For cost-performance reasons, a mobile computing 
system is usually of a distributed server architecture 
[13], [19], in which a service area, referring to the coverage
area where the server can provide services to mobile users,
contains one or many cells, where a cell refers to a
communication area covered by a base station. In general,
mobile users tend to submit transactions to servers nearby for
execution so as to minimize the communication overhead
incurred [13], [19], [28]. The properties of data objects
accessed by mobile users can usually be divided into two
types: the read-intensive type (or read type) and the
update-intensive type (or update type). Data objects are assumed to
be stored at servers to facilitate coherency control and also 
for memory saving at mobile units [31], [34]. Since the 
architecture of a mobile computing system is distributed in 
nature, data replication is helpful because it is able to 
improve the execution performance of servers and facilitate 
the location lookup of mobile users [17], [28], [31], [34]. 
The replication scheme of a data object determines how many
replicas of that object are created and to which servers
those replicas are allocated. Clearly, though avoiding many
costly remote accesses, the approach of data replication 
increases the cost of data storage and update. Thus, it has 
been recognized as an important issue to strike a compromise 
between access efficiency and storage cost when a data 
allocation scheme is devised. It is noted that various data 
allocation schemes have been extensively studied in the 
literature [31], [34]. However, the data allocation schemes 
for traditional distributed databases are mostly designed in 
static manners and the user moving patterns, which are 
particularly relevant to a mobile computing system where 
users travel between service areas frequently, were not fully 
explored. As mentioned above, the server is expected to take 
over the transactions submitted by mobile users and static 
data allocation schemes may suffer severe performance 
problems in a mobile computing system. An example 
scenario is given in Fig. 1, where without loss of generality, 
we assume that the network topology of a mobile computing 
system is a four by four mesh topology.

Figure 1. Structure of Mobile Computing

Suppose that the data are replicated statically at sites A, F, 
K, and P under the data allocation schemes for traditional 
distributed databases, and the mobile user U1 is found to
frequently travel in service areas of A, B, and C (i.e., {A, B, C}
is called the moving pattern of mobile user U1). It can be seen
that the advantage of having replicas on F, K, and P cannot be
fully exploited by mobile user U1, and the extra cost of maintaining
those replicas is not justified by the moving pattern of user U1.
In order to improve the system performance, efficient data 
allocation schemes based on moving patterns of mobile users 
are very important in a mobile computing environment. This is 
the very problem we shall address in this paper. More 
justification on the problem studied can be found in section 2.3. 
Consequently, we shall explore in this paper the approach of 
mining user moving patterns in a mobile computing 
environment and utilize the mining results to develop data 
allocation schemes. Note that the data allocation schemes 
devised not only utilize the mining results but also consider 
the properties of data objects for improving the overall 
performance of a mobile system. Specifically, for the 
development of data allocation schemes, we shall first devise 
an algorithm to capture the frequent user moving patterns
from a set of log data (referred to as the movement log) in a
mobile environment.

It is worth mentioning that to fully explore the temporal 
locality, which refers to the feature that consecutive 
movements of a mobile user are likely to fall into similar sites 
[11], the mining algorithm devised is enhanced with an incremental
mining capability in the sense that it is able to focus on the
recent moving patterns within an adjustable window size 
(referred to as retrospective factor) when deriving user moving 
patterns. As such, by avoiding generating user moving 
patterns for every single movement of individual mobile users, 
the frequent user moving patterns can be obtained much more 
efficiently without compromising the quality of results 
obtained. Then, in light of mining results of user moving 
patterns, we devise data allocation schemes that can utilize the 
knowledge of user moving patterns for proper allocation of 
both personal data (referring to those data only accessible by 
each individual data owner) and shared data (referring to those 
data to be accessed by a group of users). By employing the 
data allocation schemes devised, the occurrences of costly 
remote accesses can be minimized and the performance of a 
mobile computing system is thus improved. For personal data 
allocation, two data allocation schemes, which explore 
different levels of mining results and properties of data 
objects, are devised: one utilizes the set level of moving
patterns and the other utilizes the path level of moving
patterns, where the set level refers to the set of moving
patterns and the path level refers to the ordered sites for
prefetching determined from the moving patterns. As can be
seen later, the former is useful for the allocation of read data
objects, whereas the latter is good for the allocation of update
data objects. The data allocation schemes for shared data,
which are able to achieve local optimization and global
optimization, are also developed. Performance of these data
allocation schemes is comparatively analyzed and sensitivity
analysis on several design parameters, including the 
retrospective factor, is conducted. It is shown by our 
simulation results that the knowledge obtained from the user 
moving patterns is very important in devising effective data 
allocation schemes which can lead to significant performance 
improvement in a mobile computing system. Various data 
mining capabilities have been explored in the literature [5]. 
One of the most important data mining problems is mining 
association rules in transaction databases [1], [2], [7], [21],
[27]. Also, mining classification rules is an approach to 
develop rules that can efficiently classify data items based on 
certain features [25], [30]. Mining sequential patterns is the 
study to explore the knowledge of ordered data [3], [14]. 
Though dealing with various mining capabilities, these prior 
results were not applicable to mining user moving patterns in 
a mobile environment. In [6], an algorithm for mining Web
user traversal patterns was proposed. With the assumption that 
the backward reference is made for ease of traveling but not 
for data access in a Web environment, the mining algorithms 
and results in [6] were mainly developed subject to 
that assumption, which is, however, not applicable to mining 
user moving patterns in a mobile environment where all user 
movements have to be taken into consideration to truly reflect 
the user moving patterns. More importantly, the mining 
algorithm in [6] neither deals with data allocation nor has the 
capability of Data mining which is particularly important in a 
mobile environment. 

The very difference between these two environments calls 
for the design of a new algorithm for mining user moving 
patterns in a mobile environment. More justifications for the 
problem studied can be found in Section 2. In addition, a
significant amount of research effort has been devoted
to issues of data allocation in distributed systems [10], [13],
[16], [31], [34]. We mention in passing that the authors of [34]
proposed a data distribution scheme that is based on the
read/write patterns of the data objects. Given some user
calling patterns, the authors of [31] proposed an algorithm that
employed the concept of minimum-cost maximum-flow to 
compute the set of sites where user profiles should be 
replicated. Without fully exploiting user moving patterns, the 
attention of the studies in [31], [34] was mainly paid to the 
distribution of location data for mobile users, but not to the 
personal and shared data allocation that are explored in this 
paper. The contributions of this paper are twofold. We not 
only devise a new data mining algorithm for Data mining of 
user moving patterns in a mobile computing system, but also 
in light of the mining results obtained and the properties of 
data objects, develop data allocation schemes to improve the 
overall performance of a mobile computing system.

Figure 2. Multi model scheme using Data Mining



With Data mining capability, frequent moving 
patterns of good quality can be obtained efficiently. Moreover, 
by exploring different levels of knowledge on user moving 
patterns, different data allocation schemes, which take the 
properties of data objects into consideration, are devised and 
their performance is comparatively analyzed. To the best of our 
knowledge, prior work neither fully explored the data mining 
capability for user moving patterns in a mobile computing
system nor utilized such mining results for personal and
shared data allocation, let alone devising schemes to
exploit different levels of knowledge mined and
conducting the corresponding performance
analysis. These features distinguish this paper from others.
With the rapid advances in wireless technologies, the mobile 
computing systems are becoming widely available nowadays. 
The fast increase in mobile applications justifies the timeliness 
and importance of this study. 

This paper is organized as follows: Preliminaries are given 
in Section 2. Algorithms for mining user moving patterns in a
mobile system are devised in Section 3.1, data allocation 
schemes based on user moving patterns for personal data 
allocation are developed in Section 3.2, and those for shared 
data allocation are developed in Section 3.3. Simulation 
results are presented and analyzed in Section 4. This paper 
concludes with Section 5. 

II. PRELIMINARIES

To facilitate the presentation of this paper, some
preliminaries are given in this section. To justify the problem
studied, some real applications of wireless data access are
described in Section 2.1. A location management
method and the generation of the movement log used for
mining user moving patterns in a mobile computing
system are described in Section 2.2. The usefulness of
moving patterns for the design of efficient location
management is described in Section 2.3.

2.1 Applications of Wireless Data Access

Various wireless data networking technologies, including IS-136
[32], CDMA2000 [18], and Wireless Application Protocol
(WAP) [33], have been developed recently. Among others,
WAP provides an open standard for connecting Internet 
content and advanced value-added services to mobile 
phones. For example, an emerging WAP calendar application 
[33] allows a mobile user to access his/her own calendar 
data through a mobile phone. Such calendar data of an 
individual mobile user is an example of personal data 
considered in Section 3.2 of this paper. On the other hand, 
corporate applications, such as sales force automation in which 
salespersons can instantly obtain the latest pricing and 
competitive information for their customers, access many data 
commonly used by many mobile users. These commonly used 
data are examples of shared data considered in Section 3.3 of 
this paper. As pointed out earlier, for cost-performance reasons,
an application system which provides wireless data access
services is of a distributed server architecture and the use of
data replication is helpful in that the amount of remote access
can thus be reduced, which in turn conserves the energy of
mobile units. Consider an illustrative example in Fig. 2, where
application servers provide calendar services. Suppose that O1
and O2 are the data objects which contain the calendar
information, respectively, for mobile user 1 and mobile user
2. If data object O1 is replicated at sites A, B, and C, where
mobile user 1 frequently travels, then the amount of remote
access and the communication overhead by mobile user 1 can
be reduced since O1 can be obtained locally. Clearly, how to
select proper sites for data allocation is a very important issue 
and will benefit from effective methods for discovering user 
moving patterns which will be dealt with in this paper. 

2.2 Generation of Movement Log 

Note that in a mobile environment each mobile user is 
associated with a home location database which maintains an 
up-to-date location data for the mobile user. The location 
management procedure for a mobile computing system 
considered in this paper is similar to the one in IS-41/GSM
[12], [22], which is a two-level standard and uses a
two-tier system of home location register (to be referred to as
HLR) and visitor location register (to be referred to as
VLR) databases. Each mobile user is associated with an HLR,
whose database maintains recent mobile users' records and
their current locations. When a mobile user moves out of the
area maintained by its HLR, a copy of the mobile user's
record is created in its local VLR. In addition, the record in
the HLR is updated to reflect the movement of that user.




Figure 3. Structure of Mobility nodes during Data schemes (Partition 1 and Partition 2).

This procedure is referred to as registration. In order 
to capture user moving patterns, a movement log is needed. A
movement log records a pair (old VLR, new VLR) in the
database whenever registration occurs. At the beginning of a new
path, the old VLR is null. For each mobile user, we can obtain
a moving sequence {(O1, N1), (O2, N2), ..., (On, Nn)} from the
movement log. Generally speaking, each node in the network
topology of a mobile computing system can be viewed as a 
VLR and each link is viewed as the connection between 
VLRs. With such a mobile system considered, we propose 
algorithms for Data mining of user moving patterns and utilize 
the mining results to conduct proper data allocation in a 
mobile computing system. Consider an illustrative example in
Fig. 3, where the number next to each link represents the
sequence of movements of a mobile user. Thus, the movement
log contains the moving path for that mobile user. As pointed
out before, with the assumption that backward travels are
made for ease of traveling and, hence, need not be considered,
algorithm MF in [6] will terminate the discovery of maximal
moving sequences when any backward reference occurs. The
set of maximal moving sequences for this moving path
output by algorithm MF is {ABCDHG}.

However, such a fragmented moving sequence is of little 
interest in a mobile environment where one would naturally 
like to know the complete traveling information, i.e.,
{ABCDHGHDCBA} in this case, showing the very difference
between these two environments. Note that by taking into 
consideration the backward travels, nodes may appear more 
than once in the same maximal moving sequence and it is 
necessary to extract these sequences, and take their 
occurrences into account when the corresponding user moving 
patterns are evaluated. Clearly, to deal with these problems, as 
well as to solve such important issues as Data mining and data 
allocation which are particularly relevant to a mobile 
environment, it is essential to develop a new algorithm for 
mining user moving patterns in a mobile computing system. 
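
As a rough illustration of how a moving sequence could be read off the movement log, the sketch below groups registration records by user and orders them by time. The record layout (user, timestamp, old VLR, new VLR) and the function name moving_sequences are assumptions made for this example only; the paper does not specify a storage format.

```python
from collections import defaultdict


def moving_sequences(log):
    """Group (user, time, old_vlr, new_vlr) records into per-user moving sequences.

    The record layout is hypothetical; old_vlr is None at the start of a path.
    """
    by_user = defaultdict(list)
    for user, time, old_vlr, new_vlr in log:
        by_user[user].append((time, old_vlr, new_vlr))
    # Sort each user's records by time only, then keep the (old, new) pairs.
    return {
        user: [(old, new) for _, old, new in sorted(records, key=lambda r: r[0])]
        for user, records in by_user.items()
    }


# Illustrative log with out-of-order records for two users.
log = [
    ("u1", 2, "A", "B"),
    ("u1", 1, None, "A"),
    ("u2", 1, None, "C"),
    ("u1", 3, "B", "C"),
]
sequences = moving_sequences(log)
```

Each value in the result is the time-ordered list of (Oi, Ni) pairs for one user, which is the input expected by the mining procedure in Section 3.1.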

2.3 Mining Moving Patterns for Location 
Management: 

It is worth mentioning that, in addition to providing many
benefits in several database applications, mining user moving
patterns is particularly important in a mobile computing
system where the corresponding registration and hand-off
overhead are costly. Here, we explain this advantage of
mining user moving patterns for location management. As
described before, each mobile user is mapped to an HLR,
whose database always maintains the most recent record of the
mobile user, which includes the current location. When the mobile user
moves out of the area maintained by his/her HLR, the mobile
user needs to inform the HLR about the movement and the HLR
updates the database to reflect the mobile user's new location.
However, the massive updates incurred will degrade the
overall system performance. Several previous research efforts
have addressed this update problem of the HLR [15],
[24]. The authors in [15] proposed a location management 
scheme in which mobile users do not have to inform the HLR 
for every single movement if the movement of the mobile user 
is within a bounded partition, where a bounded partition is 
defined according to user moving patterns. Consider the 
scenario in Fig. 3, for example. 

Assume partition 1 and partition 2 are user moving patterns
of the mobile user. If each movement of the mobile user needs
to inform the HLR, there will be 10 update operations in the HLR. In
contrast, the location management scheme in [15] will incur
an update only if the mobile user moves out of a partition
(instead of "moves out of a site"), in which case only
three update operations in the HLR are incurred for the scenario
in Fig. 3 (i.e., 1 (initial update at site A in partition 1) + 1
(when the mobile user moves out of partition 1 in the 3rd
movement) + 1 (when the mobile user moves out of partition 2
in the 8th movement) = 3), thus reducing the overall overhead.
However, it is important to note that the methods to discover 
user moving patterns for the determination of partitions, which 
are, in our opinion, very important in the solution procedure, 
were absent in [15]. Clearly, user moving patterns are
beneficial for developing efficient location management and
querying strategies in a mobile computing system [15], [26],
thereby justifying another motivation of this study. 
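
The update counts quoted above can be reproduced with a small sketch. The moving path ABCDHGHDCBA is the one given for Fig. 3 in Section 2.2; the partition contents {A, B, C} and {D, G, H} are an assumption inferred from the statement that the user leaves partition 1 in the 3rd movement and partition 2 in the 8th movement, since the figure itself is not available here. The function names are illustrative.

```python
def updates_per_movement(path):
    """Every movement between sites triggers one HLR update."""
    return len(path) - 1


def updates_per_partition(path, partitions):
    """One initial update, plus one update each time the user leaves a partition."""
    partition_of = {site: idx for idx, sites in enumerate(partitions) for site in sites}
    updates = 1  # initial update at the first site
    current = partition_of[path[0]]
    for site in path[1:]:
        if partition_of[site] != current:
            updates += 1
            current = partition_of[site]
    return updates


path = "ABCDHGHDCBA"                               # moving path of Fig. 3
partitions = [{"A", "B", "C"}, {"D", "G", "H"}]    # assumed partition contents
per_movement = updates_per_movement(path)
per_partition = updates_per_partition(path, partitions)
```

Under these assumptions, per_movement is 10 and per_partition is 3, matching the figures in the text.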

III. DATA ALLOCATION SCHEMES BASED ON 
MOVING NODES 

In this section, we develop a solution procedure, 
which is composed of a sequence of algorithms in the 
corresponding steps, to mine user moving patterns and 
improve data allocation in a mobile computing system. 
Specifically, with the user movement log given, we devise 
algorithms for Data mining of moving patterns in Section 
3.1. We develop an algorithm for determining maximal 
moving sequences (to be referred to as algorithm MM) in 
Section 3.1.1. 

Then, according to the maximal moving sequences
determined, we devise an algorithm to identify large moving
sequences (to be referred to as algorithm LM) in Section 3.1.2 in an
incremental manner. The scenario for Data mining of moving patterns is
illustrated in Section 3.1.3. In light of the user moving patterns
determined, we develop data allocation schemes for personal
data allocation in Section 3.2 and for shared data allocation in
Section 3.3. Explicitly, a fixed data allocation scheme (to be
referred to as scheme DF) is presented in Section 3.2.1 first.
By considering two levels of mining results, two data 
allocation schemes for personal data allocation are devised: 
one utilizes the set level of moving patterns (to be referred to 
as scheme DS), which favours the read data object, in Section 
3.2.2 and the other utilizes the path level of moving patterns 
(to be referred to as scheme DP), which favours the update 
data objects, in Section 3.2.3. Note that the set level refers to 
the set of moving patterns and the path level refers to the 
ordered sites for prefetching determined from the moving 
patterns. Data allocation schemes based on moving patterns
for shared data allocation (to be referred to as algorithms
SD-local and SD-global) are devised in Section 3.3.

3.1 Data Mining for Moving Patterns in a Mobile Environment

Once the movement log is generated, we convert the
log data into multiple subsequences, each of which represents
a maximal moving sequence. After maximal moving
sequences are obtained, we map the problem of finding
frequent moving patterns into that of finding frequently
occurring consecutive subsequences among maximal moving
sequences. A sequence of k movements is called a large
k-moving sequence if a sufficient number of maximal
moving sequences contain this moving sequence. Such a
threshold number is called the support in this paper. Note that
after large moving sequences are determined, moving
patterns can then be obtained in a straightforward manner. A
moving pattern is a large moving sequence that is not
contained in any other moving pattern. For example,
suppose that {AB, BC, AE, CG, GH} is the set of large
2-moving sequences and {ABC, CGH} is the set of large
3-moving sequences. Then, the resulting user moving patterns
are {AE, ABC, CGH}. User moving patterns are associated
with the areas in which users frequently travel in a mobile
computing system. The overall procedure for mining moving
patterns is outlined as follows:

Procedure for Data mining of moving patterns.

Step 1 (Data collection phase). Employ algorithm MM
(to be described in Section 3.1.1) to determine maximal
moving sequences from a set of log data and also the
occurrence count of moving pairs.

Step 2 (Data mining phase). Employ algorithm LM (to
be described in Section 3.1.2) to determine large moving
sequences for every w maximal moving sequences obtained in
Step 1, where w is the retrospective factor, an adjustable
window size for the recent maximal moving sequences to be
considered.

Step 3 (Pattern generation phase). Determine user moving
patterns from the large moving sequences obtained in Step 2,
where user moving patterns are those frequently occurring
consecutive subsequences among maximal moving sequences.

Note that in the data collection phase, the occurrence counts of
moving pairs are updated online during the registration procedure.
For purposes of efficiency, algorithm LM is executed to
obtain new moving patterns in an incremental manner for every w
maximal moving sequences generated, where the unit of w is
the number of maximal moving sequences. The selection of w
will be determined empirically in Section 4. As users
travel, their moving patterns can be discovered incrementally to reflect
the user moving behaviors.
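
The pattern generation phase (Step 3) can be sketched as a simple containment filter. The sketch assumes that each site is denoted by a single character, so that a moving sequence is a string and "contained in" means "occurs as a consecutive substring"; the function name moving_patterns is illustrative and not taken from the paper.

```python
def moving_patterns(large_sequences):
    """Keep the large moving sequences that are not contained in any other one.

    Assumes single-character site names, so containment is a substring test.
    """
    sequences = set(large_sequences)
    return {
        s for s in sequences
        if not any(s != t and s in t for t in sequences)
    }


# Large 2-moving and 3-moving sequences from the example in Section 3.1.
large_2 = {"AB", "BC", "AE", "CG", "GH"}
large_3 = {"ABC", "CGH"}
patterns = moving_patterns(large_2 | large_3)
```

For the example in the text, this yields {AE, ABC, CGH}: AB and BC are absorbed by ABC, CG and GH are absorbed by CGH, and AE survives because no longer large moving sequence contains it.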

3.1.1 Finding Maximal Moving Sequences

As pointed out earlier, a moving pair, (old VLR, new
VLR), is generated in a movement log for each registration
procedure. Given a moving sequence {(O1, N1), (O2, N2),
..., (On, Nn)} of a user, we shall map it into multiple
subsequences, each of which represents a maximal moving
sequence. First, we obtain a moving sequence
{(O1, N1), (O2, N2), ..., (On, Nn)} for each mobile user from the
movement log, where the pairs (Oi, Ni) are sorted by time.
Then, algorithm MM (standing for maximal moving
sequence), whose algorithmic form is given below, is applied
to the moving sequences of each mobile user to determine the
maximal moving sequences of that user and update the
occurrence count of moving pairs during the registration
procedure. In algorithm MM, Y is used to keep the current
maximal moving sequence and F is a flag to indicate if a
node is revisited. Let DF denote the database to store all the
resulting maximal moving sequences. Also, S is the home
location site of a mobile user. According to the round-trip
model considered [20], [31], S is selected as either the VLR
or the HLR whose geographic area contains the home of the mobile
user. In order to capture the complete traveling sequence,
algorithm MM does not output a maximal moving sequence to DF
until S is reached. In line 1 of algorithm MM, some
parameters are initialized. Then, moving sequences are
scanned in line 2.

A maximal moving sequence is output and a new
maximal moving sequence will be explored (from line 14 to
line 18 of algorithm MM) if MM finds that Ni in the moving
pair (Oi, Ni) is the same as the starting site S. Otherwise, Ni is
appended to Y (in line 12 of algorithm MM) and the
occurrence count of (Oi, Ni) is updated online in the database
(in line 13 of algorithm MM). An example execution scenario
by algorithm MM for the input in Fig. 3 is given in Table 1.
As mentioned earlier, algorithm MM differs from
algorithm MF in that the former does not output a maximal
moving sequence when backward travels
are encountered (such as in the 6th move). Instead, algorithm MM
keeps tracing the user movements until the starting point is
reached and updates the occurrence count of (Oi, Ni) online so
as to reduce the overhead of database scans for generating large
moving pairs.



TABLE I. Execution result of algorithm in status report.

Movement   String Y by   Maximal moving
1          AB            AB
2          ABC           ABC
3          ABCD          ABCD
4          ABCDE         ABCDE
5          ABCDEF        ABCDEF
6          ABCDEFG       ABCDEFG

Algorithm MM /* Algorithm MM for finding maximal moving sequences */
Input: A moving sequence {(O1, N1), (O2, N2), ..., (On, Nn)} of a mobile user.
Output: Maximal moving sequences of the mobile user.
begin
1. Set i to 1 and string Y to null, where Y is used to
   keep the current maximal moving sequence and S is the
   starting point.
2. while (not end of moving sequence)
3. begin
4.   Set A = Oi and B = Ni;
5.   if (A == S)
6.   begin
7.     Set Y = S;
8.     Append B to Y;
9.   end
10.  else
11.  begin
12.    Append B to string Y;
13.    Update the occurrence count of (A, B) in database DF;
14.    if (B == S)
15.    begin
16.      Output string Y to database DF;
17.      Set Y to null;
18.    end
19.  end
20.  i++;
21. end
end
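
The data collection phase can be sketched in runnable form as follows. This is not a line-by-line transcription of algorithm MM: for simplicity, the sketch assumes single-character site names, treats a pair whose old VLR is None as the start of a new path, and counts every moving pair (including the first pair of a path). The function name algorithm_mm and the in-memory Counter standing in for database DF are assumptions of this example.

```python
from collections import Counter


def algorithm_mm(pairs, home):
    """Return (maximal moving sequences, moving-pair counts) for one user.

    pairs: time-ordered (old_vlr, new_vlr) tuples; old_vlr is None at path start.
    home:  the starting site S; a sequence is output only when S is reached again.
    """
    sequences = []
    pair_counts = Counter()      # stands in for the online counts kept in DF
    current = []                 # plays the role of string Y
    for old, new in pairs:
        if old is None:          # start of a new path: only the first site is known
            current = [new]
            continue
        if not current:          # previous sequence was just output
            current = [old]
        current.append(new)
        pair_counts[(old, new)] += 1
        if new == home:          # back at the starting site S
            sequences.append("".join(current))
            current = []
    return sequences, pair_counts


# Moving path of Fig. 3, including the backward travels.
path = "ABCDHGHDCBA"
pairs = [(None, path[0])] + list(zip(path, path[1:]))
sequences, pair_counts = algorithm_mm(pairs, home="A")
```

Because the sequence is only closed when the home site is reached, the backward travels G to H, H to D, and so on stay inside one maximal moving sequence instead of cutting it short.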

3.1.2 Finding Large Moving Sequences

With the maximal moving sequences obtained, we next 
determine the large moving sequences. A large moving 
sequence can be determined from all maximal moving 
sequences of each individual user based on its occurrences in 
those maximal moving sequences. Use intra sequence count to 
mean the number of occurrences of a moving sequence within 
a maximal moving sequence and inter sequence set of a 
moving sequence to mean the set of maximal moving 
sequences which contain that moving sequence. The count of 
a large moving sequence is the sum of intra sequence counts 
from its inter sequence set. For the example in Table 2, the 
intra sequence count of GB in {ABCGBCGBA} is 2 and that 
in {ABGBA} is 1. Also, the inter sequence set of GB is 
{{ABCGBCGBA}, {ABGBA}}. Hence, the count of GB is 
the sum of intra sequence counts in its inter sequence set (i.e., 
2 (i.e., intra sequence count in ABCGBCGBA) + 1 (i.e., intra 
sequence count in ABGBA) = 3). To cope with this problem, 
we develop algorithm LM (standing for large moving 
sequence) for the determination of large moving sequences. 
Let Lk represent the set of all large k-moving sequences and 
Ck be a set of candidate k-moving sequences. 
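
The counting rule above can be checked with a short sketch. It assumes single-character site names so that sequences are strings; the helper names intra_sequence_count and total_count are illustrative. Occurrences are counted with a sliding window rather than Python's str.count, because str.count skips overlapping matches and moving sequences with loops may overlap.

```python
def intra_sequence_count(moving_sequence, maximal_sequence):
    """Number of (possibly overlapping) occurrences within one maximal moving sequence."""
    k = len(moving_sequence)
    return sum(
        maximal_sequence[i:i + k] == moving_sequence
        for i in range(len(maximal_sequence) - k + 1)
    )


def total_count(moving_sequence, maximal_sequences):
    """Sum of intra sequence counts over the inter sequence set."""
    inter_sequence_set = [
        s for s in maximal_sequences
        if intra_sequence_count(moving_sequence, s) > 0
    ]
    return sum(intra_sequence_count(moving_sequence, s) for s in inter_sequence_set)


maximal_sequences = ["ABCGBCGBA", "ABGBA"]
gb_first = intra_sequence_count("GB", "ABCGBCGBA")
gb_second = intra_sequence_count("GB", "ABGBA")
gb_total = total_count("GB", maximal_sequences)
```

For GB this gives 2 and 1 for the two maximal moving sequences and a total count of 3, as in the example.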

Algorithm LM /* Algorithm for finding large moving sequences */
Input: A set of w maximal moving sequences of a mobile user.
Output: Large moving sequences of the mobile user.
begin
1. Determine L2 = {large 2-moving sequences}
   from moving pairs in C2;
2. for (k = 3; L(k-1) ≠ ∅; k++)
3. begin
4.   Ck = L(k-1) * L(k-1); /* Generating Ck from L(k-1) * L(k-1) */
5.   for each of the w maximal moving sequences S
6.   begin /* Calculating the intra sequence count of Ck within S */
7.     intra sequence = subsequence(s, S);
8.     if (intra sequence ≠ 0)
9.       Include S into inter sequence set; /* sum of
         occurrence counts in an inter sequence set */
10.    for all candidates c ∈ inter sequence set
11.      c.count = c.count + c.intra sequence;
12.  end
13.  Lk = {c ∈ Ck | c.count ≥ support};
14. end
end
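
A compact, runnable sketch of the level-wise search is given below. It is an approximation of algorithm LM under stated assumptions: sites are single characters, the join L(k-1) * L(k-1) is taken to combine two sequences when the first without its leading site equals the second without its trailing site, a candidate is large when its count is at least the support, and the moving-pair counts are recomputed from the input rather than read from the counts maintained online by algorithm MM. All function names are illustrative.

```python
from collections import Counter


def join_candidates(large_prev):
    """Build C_k from L_(k-1) * L_(k-1) by joining sequences that overlap in k-2 sites."""
    return {a + b[-1] for a in large_prev for b in large_prev if a[1:] == b[:-1]}


def occurrences(candidate, sequence):
    """Intra sequence count: overlapping occurrences of candidate in sequence."""
    k = len(candidate)
    return sum(sequence[i:i + k] == candidate for i in range(len(sequence) - k + 1))


def algorithm_lm(maximal_sequences, support):
    """Return {k: set of large k-moving sequences} for the given window of sequences."""
    pair_counts = Counter(
        s[i:i + 2] for s in maximal_sequences for i in range(len(s) - 1)
    )
    large = {2: {c for c, n in pair_counts.items() if n >= support}}
    k = 3
    while large[k - 1]:
        counts = {
            c: sum(occurrences(c, s) for s in maximal_sequences)
            for c in join_candidates(large[k - 1])
        }
        large[k] = {c for c, n in counts.items() if n >= support}
        k += 1
    del large[k - 1]  # drop the final empty level
    return large


large = algorithm_lm(["ABCGBCGBA", "ABGBA"], support=2)
```

The support value of 2 is chosen only for this illustration. Calling join_candidates on {AB, BK} returns {ABK}, which agrees with the candidate generation example given for line 4.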

As pointed out in [27], the initial candidate set
generation, especially for L2, is the key issue in improving the
performance of data mining. Since the occurrence counts of
moving pairs, i.e., C2, were updated online in the data
collection phase, L2 can be determined efficiently by proper
trimming of C2 (line 1 of algorithm LM), showing the
advantage of having online updates in algorithm MM. Also,
note that Ck can simply be generated from L(k-1) * L(k-1) (line 4
of algorithm LM). For example, with the set of L2 being {AB,
BK}, we have C3 as {ABK}. As explained above, the
occurrence count of each k-moving sequence is the sum of
intra sequence counts (from line 5 to line 9 in algorithm LM)
in its inter sequence set (i.e., line 10 and line 11 in algorithm
LM). Note that this step is very different from that in mining
the path traversal patterns [6] where there are no loops in a
moving sequence (i.e., the corresponding intra sequence count
is always zero). The occurrences of each k-moving sequence
in Ck are determined for the identification of Lk. After the
summation of the occurrence counts in the inter sequence set
from line 10 to line 11 in algorithm LM, those k-moving
sequences with counts exceeding the support are qualified as
Lk (line 13 of algorithm LM). Notice that those large
k-moving sequences are obtained from w maximal moving
sequences of that mobile user, showing the data mining
capability of algorithm LM. For illustrative purposes, with the
maximal moving sequences of a mobile user being
{ABCGBCGBA, ABGBA}, Table 2 shows the corresponding
counts of C2.



TABLE II. Intra sequence and total count for intra sequence.

C2    Intra sequence    Total count
AB    1                 2
BC    2                 2
CG                      2
BG                      1
GB    1                 3



3.1.3 An Illustrative Example for Data Mining of Moving 
channels 

Consider an illustrative example of two consecutive runs
of Data mining of moving patterns in Fig. 4, where the
network topology of the mobile computing system is the one in
Fig. 1. Fig. 4 shows run i, where the retrospective factor w
is set to 10. In the data collection phase, algorithm MM
determines maximal moving sequences and counts the
occurrences of moving pairs. Using the w*i maximal moving
sequences generated by MM, LM determines, in the Data
mining phase, large moving sequences. In our illustrative
example, we set the value of the support to be 4 and assume
that the value of i is 1. It can be verified that the set of L2
is {NJ, NM, MN} and the set of L3 is {NMN}. In the
pattern generation phase, moving patterns (i.e., {NJ, NMN} in
Fig. 4) are determined from the large moving sequences obtained
in the Data mining phase (i.e., L2 and L3 in this illustrative
example). Following the same procedure, moving patterns
(i.e., {NJ, ON, POPO} in Fig. 4) are generated incrementally as
the mobile user travels. With the advantage of updating the
occurrence counts of C2 online in algorithm MM,
algorithm LM can be effectively applied to explore
all the moving paths and generate user moving patterns.

Note that due to the temporal locality of the user moving
behaviour, the identities of the moving patterns discovered may
vary as the time window advances. It can be verified that the
procedure of mining user moving patterns is of polynomial
time complexity. Specifically, with n moving pairs in a
moving sequence, the complexity of algorithm MM is O(n).
Similarly to many other algorithms for large item set
generation, the complexity of algorithm LM is in proportion to
the number of Lk. As pointed out in [27], the initial candidate
set generation, especially for L2, is the key issue for the
performance improvement of the mining process. It is worth
mentioning that since the occurrence counts of moving pairs, i.e.,
C2, were updated online in the data collection phase, L2 can
be efficiently determined by proper trimming of C2, showing
an effective feature of the proposed procedure of mining
moving patterns. As mentioned before, after large moving
sequences are determined, moving patterns can be obtained in
a straightforward manner. Moving patterns of a mobile user
correspond to frequent movements of this mobile user in a
mobile computing system. Thus, as will be explored in
Section 3.2 below, in light of the mining results obtained from
algorithms MM and LM, we can utilize the knowledge of user
moving patterns to devise effective data allocation schemes so
as to improve the overall performance of a mobile computing
system. Since the knowledge of user moving patterns is
discovered incrementally, the data allocation schemes are able to
dynamically change the replicated servers to fully exploit the
moving patterns of mobile users. Therefore, if the mobile user
changes his/her moving behaviour, the data allocation
schemes devised will be able to adaptively determine the new
replication plan by using the Data Mining algorithm for
moving patterns devised.
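The figures quoted for run i can be checked with a short brute-force script. The sketch below is not algorithm MM or LM; it simply counts every contiguous sub-path, under the assumptions that the ten moving paths of run i read as listed in `RUN_I`, that a sequence is large when its count is at least the support, and that moving patterns are the large sequences not contained in a longer large sequence. The function name `moving_patterns` is hypothetical.

```python
from collections import Counter


def moving_patterns(paths, support):
    """Brute-force check: large contiguous sub-paths, reduced to maximal ones."""
    counts = Counter()
    for p in paths:
        for k in range(2, len(p) + 1):
            counts.update(p[i:i + k] for i in range(len(p) - k + 1))
    large = {s for s, n in counts.items() if n >= support}
    # Drop any large sequence that is part of a longer large sequence.
    return {s for s in large if not any(s != t and s in t for t in large)}


# Moving paths of run i as read from Fig. 4 (spacing in the figure removed).
RUN_I = ["NJKLPON", "NON", "NJN", "NON", "NMN",
         "NJIJKLHDCDCBFJN", "NMN", "NMN", "NJN", "NMN"]

print(sorted(moving_patterns(RUN_I, support=4)))  # prints ['NJ', 'NMN']
```

Here NJ, NM and MN each occur four times and NMN occurs four times, so NM and MN are absorbed into NMN, which agrees with the moving patterns {NJ, NMN} stated for run i.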



Run i:

• Data collection phase (by MM): 10 moving paths of a
mobile user:
NJKLPON, NON, NJN, NON, NMN,
NJIJKLHDCDCBFJN, NMN, NMN, NJN, NMN

• Incremental mining phase: employing LM on the w*i
moving paths of a mobile user to calculate the candidate
frequent moving patterns (Ck) and the large moving
patterns (Lk)

• Pattern generation phase: moving patterns are {NJ, NMN}

Run i+1:

• Data collection phase (by MM): 10 moving paths of a
mobile user:
NOPOPON, NJIJKON, NJIMN, NOPOPOKLPOKON, NOKON

• Incremental mining phase: employing LM on the w*(i+1)
moving paths of a mobile user to calculate the candidate
frequent moving patterns (Ck) and the large moving
patterns (Lk)

• Pattern generation phase: moving patterns are
{NJ, ON, POPO}

Figure 4. Two consecutive runs of Data mining moving patterns.

3.2 Schemes for Personal Data Allocation 

As mentioned above, the occurrences of remote accesses
can be reduced if data allocation schemes are devised in light
of user moving patterns. In this section, we propose efficient
data allocation schemes based on user moving patterns for
personal data allocation, where personal data (e.g., the personal
calendar data described in Section 2.1) are those data
accessible only by each data owner. We first describe in Section
3.2.1 the data allocation scheme in a fixed pattern. Then, we
devise two data allocation schemes based on two levels
of user moving patterns, namely the set level and the
path level, in Section 3.2.2 and in Section 3.2.3, respectively.

3.2.1 DF (Data Allocation Scheme in a Fixed Pattern)

In the scheme which allocates data in a fixed pattern (referred
to as scheme DF), the replication sites are determined when the
database is created. Explicitly, the number of replicated sites
and the sites at which the personal data can be replicated are
predetermined. Though adopted in some traditional
distributed database systems due to its ease of
implementation [34], scheme DF is not suitable for mobile
computing environments where mobile users move frequently.
Consider the example mobile computing system in Fig. 1
and assume that the replicated servers under scheme DF are
{AFKP}. As explained before, if mobile user U1 is found to
frequently travel in the service areas of A, B, and C, then the
advantage of having replicas on F, K, and P is not exploited
by user U1. In our experimental studies in Section 4, scheme
DF will be implemented and evaluated for comparison
purposes.

3.2.2 DS (Data Allocation Scheme Based on the Set of
Moving Patterns)

As described before, by exploring different levels of
mining results, two data allocation schemes are developed:
one utilizes the set level of moving patterns and the other
utilizes the path level of moving patterns. In this section, the
data allocation scheme based on the set of moving patterns
(referred to as scheme DS) is described. Scheme DS takes
advantage of moving patterns so as to reduce the number of
remote accesses. Consider the example profile in Table 3,
where the network architecture is the one in Fig. 1. The set of
replicated servers under scheme DS is the set of servers
determined from the moving patterns of mobile users. For
example, the set of replicated servers for U1 is the union
of the moving patterns of U1 (i.e., {AE} ∪ {ABC} = {ABCE})
and that of U2 is {BCGF}. In addition, as will be shown in
Section 4, scheme DS is able not only to increase the hit ratio
of local data access but also to balance the workloads of servers,
since more replicated servers are employed under scheme DS
than under scheme DF. However, under scheme DS,
more replicated sites are employed as the number of sites
appearing in user moving patterns increases. As will be
seen in Section 4, this reduces the occurrences of remote
access significantly but may result in too many replicated sites,
thus compromising the overall performance improvement
achieved. In order to reduce the cost of maintaining
replicated sites, we shall take the properties of data objects
into consideration. Clearly, for a read data object, data
replication is preferable in order to increase the likelihood of
local reads. As a result, scheme DS favours read
intensive data objects. On the other hand, for an update data
object, scheme DS is not appropriate since it results
in too many replicated sites, increasing the costs of update and
communication. In view of this, we shall investigate the
approach of utilizing the mining results at the path level,
instead of the set level as in scheme DS. This will allow us to
employ the technique of prefetching properly and to ensure that
the number of replicated sites does not exceed a predetermined
limit (such a limit is called the prefetch size). Using the path
level knowledge, as a mobile user moves, the personal data can be
prefetched to the candidate sites predicted from user moving
patterns, and the advantage of mining user moving patterns can
be fully exploited. This is the very motivation on which the
design of scheme DP is based.
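The set-level rule of scheme DS can be illustrated with a few lines of Python. This is a minimal sketch under the assumption that each moving pattern is a string of one-letter site names; the function name `ds_replicated_servers` and the dictionary `profile` are hypothetical, with the patterns taken from the example in the text.

```python
def ds_replicated_servers(moving_patterns):
    """Scheme DS: replicate on every site that appears in any moving pattern."""
    sites = set()
    for pattern in moving_patterns:
        sites.update(pattern)
    return sites


# Example profile: moving patterns per user as described in the text.
profile = {"U1": ["AE", "ABC"], "U2": ["BCGF"], "U3": ["BCD"], "U4": ["CGK"]}
for user, patterns in profile.items():
    print(user, "".join(sorted(ds_replicated_servers(patterns))))
```

Because the replica set is a plain union, its size grows with the number of distinct sites in the patterns, which is the maintenance cost that motivates the path-level scheme.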



TABLE III. Comparison between mobility and mobile nodes

User ID    Moving nodes    Number of movements
U1         AE, ABC         1500
U2         BCGF            350
U3         BCD             300
U4         CGK             200



3.2.3 DP (Data Allocation scheme for moving nodes)

Assume that a mobile user Ui, who is currently at site A,
has a moving path ABCD. Clearly, before Ui moves into site
B, fetching the required data to site B will reduce
the response time of later accesses. This technique is in
general referred to as prefetching. In this section, a data
allocation scheme based on the paths of moving patterns
(referred to as scheme DP) is described. When all moving
patterns have been scanned, the search stops and the set of
prefetch sites is returned (line 14 of scheme DP). The
candidate prefetch sites are determined accordingly.

Moving path of U1: S1 = A, S2 = B, S3 = C, S4 = G
Local data access: A: miss, B: hit, C: hit, G: miss
Candidate prefetch sites (in the order shown): {B, E}, {C, E}, {C, E}

Figure 5. An illustrative example for local data access hits by U1 under
scheme DP.

Scheme DP:

Input: The current site, user moving patterns and
Pfetch.
Output: The candidate sites for prefetching.
begin

1. Initialize the set of prefetch sites Pf, where |Pf| is the
number of prefetch sites included thus far;

2. Search the current site in user moving patterns;

3. while (Pfetch > |Pf|) /* Pfetch is the predetermined
prefetch size */

4. begin

5. Search user moving patterns of the corresponding user
by using current site as the key;

6. if (found)

7. begin

8. Include the site next to the current site to Pf;

9. Select this user moving pattern as a candidate user
moving pattern used for prefetch;

10. end

11. else

12. if all moving patterns are scanned, go to line 14.

13. end

14. return Pf
end
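The selection of candidate prefetch sites in scheme DP can be sketched as follows. This is an illustrative reading of the pseudocode rather than a definitive implementation: it assumes patterns are strings of one-letter site names, that only the first occurrence of the current site in each pattern is used, and that patterns are scanned in the order given. The function name `prefetch_sites` is hypothetical.

```python
def prefetch_sites(current_site, moving_patterns, pfetch):
    """Return up to pfetch candidate prefetch sites for a user at current_site."""
    pf = []                                # line 1: prefetch sites chosen so far
    for pattern in moving_patterns:        # lines 3-13: scan the user's patterns
        if len(pf) >= pfetch:              # line 3: stop once |Pf| reaches Pfetch
            break
        for i, site in enumerate(pattern[:-1]):
            if site == current_site:       # line 6: current site found
                nxt = pattern[i + 1]       # line 8: site next to the current one
                if nxt not in pf:
                    pf.append(nxt)
                break
    return pf                              # line 14


patterns_u1 = ["AE", "ABC"]
print(prefetch_sites("A", patterns_u1, pfetch=2))  # prints ['E', 'B']
print(prefetch_sites("B", patterns_u1, pfetch=2))  # prints ['C']
```

Unlike the set-level scheme, the number of sites returned never exceeds the prefetch size, whatever the number of sites in the moving patterns.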

For illustrative purposes, different scenarios under
data allocation schemes DF, DS and DP are shown in Table 4.
In order to evaluate the efficiency of data allocation schemes,
the local hit ratio can be formulated as
$\sum_{j=1}^{k} N_{S_{i,j}} \times P_{S_{i,j}}$, where $S_{i,j}$
represents the jth moving hop of user Ui, k is
the length of the moving path of Ui, and $N_{S_{i,j}}$ is the
probability for user Ui to appear at site $S_{i,j}$. $P_{S_{i,j}}$
is equal to 1 if data can be found at site $S_{i,j}$, and is 0
otherwise. Consider the example scenario in Table 3 and Table 4,
where the replicated servers under DS for U1 are {ABCE} and the
example moving path of U1 is {ABCG}. The length of this
moving path is 4. Also, $S_{1,1}$ is A at first. Since the data can
be found in the set of replicated servers under DS, $P_{S_{1,1}}$ is
set to 1. Due to the movement of U1, the next site of U1 is B,
which is denoted as $S_{1,2}$. Since the data can also be found in
the set of replicated servers under DS, $P_{S_{1,2}}$ is set to 1. It
can be verified that the local hit ratio of U1 under scheme DS can
be formulated as $N_A + N_B + N_C$. Also, the local hit ratio
of U1 under scheme DF is $N_A$ since A is the intersection of
{ABCG} and {AFKP}. With the execution scenario shown in
Fig. 5, the local hit ratio of U1 under scheme DP can be
expressed as $N_B + N_C$ because B and C are the sites for
successful prefetching whereas F is not. As can be seen, the
candidate prefetch sites of DP vary when the corresponding
user moves. Following the same procedure, the local hit ratios
for the moving paths of all other mobile users under schemes
DF, DS, and DP can be obtained. As can be seen from the
experimental results in Section 4, scheme DP performs better
than scheme DS in that it can achieve the same high local data
access hit ratios as DS while incurring a much smaller cost for
maintaining replicated sites than DS. This shows the
advantage of employing the path level knowledge to do proper
prefetches, which makes scheme DP suitable for update data objects.
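The local hit ratio for a fixed replica set (schemes DF and DS) can be evaluated with the short sketch below. The appearance probabilities in `N` are hypothetical values chosen only for illustration, and the function name `local_hit_ratio` is an assumption; the DP case, where the replica set changes at each hop, is not covered here.

```python
def local_hit_ratio(path, replicas, appear_prob):
    """Sum N_s * P_s over the hops of path, with P_s = 1 if s holds a replica."""
    return sum(appear_prob[s] for s in path if s in replicas)


# Hypothetical appearance probabilities of U1 along the example path ABCG.
N = {"A": 0.4, "B": 0.3, "C": 0.2, "G": 0.1}

ds_ratio = local_hit_ratio("ABCG", set("ABCE"), N)  # scheme DS: N_A + N_B + N_C
df_ratio = local_hit_ratio("ABCG", set("AFKP"), N)  # scheme DF: N_A
print(ds_ratio, df_ratio)
```

Only the sites that lie both on the moving path and in the replica set contribute, so replicas on E (under DS) or on F, K and P (under DF) add maintenance cost without raising the ratio for this path.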

3.3 Scheme for Shared Data Allocation 

Shared data refers to those data that are used by many
mobile users. Examples of shared data include public information,
cooperative information, etc. By properly determining the
set of replicated servers used by a group of mobile users, data
allocation for shared data is able to increase the local data
access ratio in the sense of both local and global optimization.
Local optimization refers to the optimization in which the
likelihood of local data access by an individual mobile user is
maximized, whereas global optimization refers to the
optimization in which the likelihood of local data access by all
mobile users is maximized. With the user moving patterns
obtained, we can develop a shared data allocation algorithm (to
be referred to as algorithm SD) to determine the set of
replicated servers. Algorithm SD is greedy in nature and its
performance will be evaluated experimentally in Section 4.
Recall that the moving patterns of mobile users may contain
different large k-moving sequences, Lk. We first convert these
Lk's into L2's, and the allocation of shared data will be
made in accordance with the occurrences of these L2's. An
algorithmic form of SD is given below. Consider for example
the moving scenario in Table 3, where U1 is the frequently
moving mobile user (with 1,500 movements) and U2, U3, and
U4 are infrequently moving users (with 350, 300, and 200
movements, respectively). An example profile for the
counting in algorithm SD is given in Table 5, where the user
occurrence count of an L2 is the number of mobile users whose
moving patterns contain that L2 and the total movement
occurrence count of an L2 is the sum of all the movement
occurrence counts of that L2 from all mobile users. For
example, since {AB} can only be found in the moving
patterns of U1, the user occurrence count of {AB} is one.
Also, since U1, U2, and U3 contain {BC} in their moving
patterns, the user occurrence count of {BC} is 3. Let nBC(Ui)
denote the occurrence count of {BC} in the moving paths of
mobile user Ui. The movement occurrence count of {BC} is
thus the sum of nBC(U1), nBC(U2) and nBC(U3). This
methodology can in fact be used to achieve both the local
optimization (referred to as SD-local) and the global
optimization (referred to as SD-global), in that one can use the
total user occurrences of an L2 for counting to achieve local
optimization and the total movement occurrences of
that L2 for counting to achieve global optimization. Consider
the execution of SD-local as an example.
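The conversion of moving patterns into L2 pairs and the user occurrence counting used by SD-local can be sketched as follows. This is an illustrative sketch assuming patterns are strings of one-letter site names and using the example moving patterns described in the text; the names `l2_pairs` and `user_occurrence_counts` are hypothetical. Movement occurrence counts (for SD-global) would additionally need the per-user counts n(Ui), which are not derivable from the patterns alone.

```python
from collections import Counter


def l2_pairs(pattern):
    """Split a large k-moving sequence into its consecutive L2 pairs."""
    return {pattern[i:i + 2] for i in range(len(pattern) - 1)}


def user_occurrence_counts(profile):
    """Number of users whose moving patterns contain each L2 pair (SD-local)."""
    counts = Counter()
    for patterns in profile.values():
        # Each user contributes at most once per pair.
        pairs = set().union(*(l2_pairs(p) for p in patterns))
        counts.update(pairs)
    return counts


profile = {"U1": ["AE", "ABC"], "U2": ["BCGF"], "U3": ["BCD"], "U4": ["CGK"]}
counts = user_occurrence_counts(profile)
print(counts["AB"], counts["BC"], counts["CG"])  # prints 1 3 2
```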

TABLE IV. The scenarios under Different Data Allocation
Schemes for Personal Data Allocation

(a) Scenario under DF

User with an example moving path    Replicated server under DF    Local data hit of DF
U1 with ABCG                        AFKP                          N_A
U2 with BCGF                        AFKP                          N_F
U3 with BCDH                        AFKP                          0
U4 with CGH                         AFKP                          0





(b) Scenario under DS

User with an example moving path    Replicated server under DS    Local data hit of DS
U1 with ABCG                        ABCE                          N_A + N_B + N_C
U2 with BCGF                        BCGF                          N_B + N_C + N_G + N_F
U3 with BCDH                        BCD                           N_B + N_C + N_D
U4 with CGH                         CGK                           N_C + N_G



(c) Scenario under DP

User with an example moving path    Moving patterns used under DP    Local data hit of DP
U1 with ABCG                        AE, ABC                          N_B + N_C
U2 with BCGF                        BCGF                             N_C + N_G + N_F
U3 with BCDH                        BCD                              N_C + N_D
U4 with CGH                         CGK                              N_G



Algorithm SD: /* Performing both SD-local and SD-global
for shared data allocation */
Input: All user moving patterns of mobile users.
Output: The set of replicated servers, i.e., R.
begin

1. Determine, from all frequent moving patterns obtained
by algorithm LM, both user occurrence counts (for SD-local)
and movement occurrence counts (for SD-global)
of all frequent L2's;

2. Repeat until |V| = 0; /* |V| is the number of replicated
servers yet to determine */

3. begin

4. Include those L2's that have maximal support from the
set of all L2's into the set c-max. Also, c denotes an L2
pair in c-max.

5. if |R| = 0 /* R is the set of replicated servers */

6. begin

7. Choose an L2 pair from c-max;

8. Include this L2 pair into R and exclude this L2 pair
from the set of all L2's;

9. |V| = |V| - 2;

10. end

11. else if (∃ c ∈ c-max and R ∩ c ≠ ∅)

12. begin

13. In c-max, choose an L2 pair that has an intersection
with pairs in R and exclude this L2 pair from the set of
all L2's;

14. |V| = |V| - 1;

15. end

16. else /* In c-max, there is no L2 pair that has an
intersection with pairs in R */

17. begin

18. Choose an L2 pair from c-max and exclude this L2 pair
from the set of all L2's;

19. |V| = |V| - 2;

20. end

21. R = R ∪ c;

22. end
end
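The greedy selection in algorithm SD can be sketched in Python as below. This is a simplified reading rather than the authors' implementation: instead of decrementing |V| by a fixed 1 or 2, it counts the sites actually added to R, it never exceeds the requested number of servers, and it breaks ties among pairs of equal support alphabetically. These choices, and the function name `shared_data_allocation`, are assumptions. The same function serves SD-local or SD-global depending on whether user occurrence counts or movement occurrence counts are passed as supports.

```python
def shared_data_allocation(pair_support, num_servers):
    """Greedily pick num_servers replicated servers from L2 pair supports."""
    remaining = dict(pair_support)    # L2 pairs not yet considered
    replicas = []                     # R, kept in order of inclusion
    while len(replicas) < num_servers and remaining:
        top = max(remaining.values())
        # Line 4: c-max holds the pairs with maximal support.
        c_max = sorted(p for p, s in remaining.items() if s == top)
        # Lines 11-13: prefer a pair in c-max that intersects R;
        # otherwise (lines 5-8 and 16-18) take any pair from c-max.
        touching = [p for p in c_max if set(p) & set(replicas)]
        pair = touching[0] if touching else c_max[0]
        del remaining[pair]
        # Line 21: R = R ∪ c, without exceeding the number of servers.
        for site in pair:
            if site not in replicas and len(replicas) < num_servers:
                replicas.append(site)
    return replicas


# User occurrence counts of the example profile (SD-local).
user_occurrence = {"AB": 1, "BC": 3, "CD": 1, "CG": 2, "GK": 1}
print(shared_data_allocation(user_occurrence, num_servers=3))
# prints ['B', 'C', 'G']
```

Beyond three servers the remaining pairs all have the same support, so the next site chosen depends entirely on the tie-breaking rule.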

Let R denote the set of replicated servers we identify thus
far. Once the supports of all L2 pairs are obtained, we include
the L2 which has the maximal user occurrence count (i.e., {BC}
according to the profile in Table 5) into the set R in line 4 of
algorithm SD. In general, if the number of replicated servers,
|R|, is not equal to the number of replicated servers required,
we select, from the existing L2 pairs that have maximal support
(i.e., c-max), one that has an intersection with pairs in R (from
line 12 to line 15 of algorithm SD). The pair {CG} is hence
selected. This step is similar to Prim's algorithm for finding a
minimal-cost spanning tree (MCST) [8]. The difference
between SD and MCST is that even if the L2 pair with maximal
support does not have any intersection with R, this pair
can still be included into R, as described from line 17 to line
20 of algorithm SD. After the inclusion of {CG}, R becomes
{BCG}. Following this procedure, we shall identify and
include more proper L2 pairs until |R| reaches the number of
replicated servers required (i.e., |V| = 0). After the inclusion of
{CG}, there are two possible L2 pairs (i.e., {GF} and
{GK}) to be considered. Thus, one can choose either K or F to
be included into the set R. In this illustrative example,
without loss of generality, K is included in R. Finally, R is
composed of the most frequent moving sites for all mobile
users in the sense of local optimization. The replicated servers
under SD-global can be obtained similarly. This completes the
example execution of algorithm SD with the profile given in Table 5.
Note that the local hit of mobile users using SD is higher than
that using DF. Also note that the infrequently moving users
(such as U2, U3, and U4) will have better local access hits
when using SD-local than when using SD-global. On the other hand,
a frequent user like U1 performs better under SD-global than
under SD-local. These agree with our intuition in that SD-local
deals with user occurrence counts and SD-global considers
mainly movement occurrence counts, resulting in the situation
that SD-local will favour users
moving infrequently and SD-global is good for users moving
frequently.



TABLE V. An example profile for the counting in algorithm SD

L2    User occurrence    Movement
AB    1                  nAB(U1) = 800
BC    3                  nBC(U1) + nBC... = 400
CD    1                  nCD(U3) = 200
CG    2                  nCG(U2) + nCG(U4)
GK    1                  nGK(U4) = 100



IV. PERFORMANCE STUDY 

The effectiveness of using the knowledge of user
moving patterns for data allocation is evaluated empirically
in this section. The simulation model for the mobile system
considered is described in Section 4.1. Experimental results of
personal data allocation, including those of DF, DS, and DP,
are shown in Section 4.2. Experimental results of algorithm SD
for shared data allocation are shown in Section 4.3.

4.1 Simulation Model for a Mobile System

To simulate the
servers in a mobile computing system, we use a four by four
mesh network [20], [23], where each node represents one
server and, hence, there are 16 servers in this model. It is
assumed that there are 100,000 mobile users in our
simulations. A moving path is a sequence of servers accessed
by a mobile user. The size of each moving path is modelled as
a uniform distribution between MAXLEN-5 and
MAXLEN+5. As explained before, the starting position of a
moving path for a mobile user can be either VLR or HLR and
is randomly selected between 1 and 16 in each run of the
simulation. The number of operations submitted by a mobile
user to its nearby server is modelled by a uniform distribution
between SITEOP-2 and SITEOP+2. After the server has
completed these operations, the mobile user moves to one of
its neighbouring servers depending on a probabilistic model.
Explicitly, the probability that a mobile user moves back to the
server where this user came from is modelled by Pback, and
the probability that the mobile user moves to each of the other
servers is (1 - Pback)/(n - 1), where n is the number of
possible servers this mobile user can move to.
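The movement model on the four by four mesh can be sketched as follows. This is only an illustrative sketch of the stated probabilities, not the simulator used in the experiments: servers are numbered 0 to 15 here rather than 1 to 16, the path length is passed in directly instead of being drawn around MAXLEN, and the function names `neighbours`, `next_server` and `moving_path` are hypothetical.

```python
import random


def neighbours(server, side=4):
    """Servers adjacent to server (numbered 0..side*side-1) in the mesh."""
    row, col = divmod(server, side)
    steps = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [r * side + c for r, c in steps if 0 <= r < side and 0 <= c < side]


def next_server(current, previous, p_back, rng):
    """Go back to previous with probability p_back, else pick another
    neighbour uniformly, i.e. each with probability (1 - p_back)/(n - 1)."""
    options = neighbours(current)
    others = [s for s in options if s != previous]
    if previous in options and (not others or rng.random() < p_back):
        return previous
    return rng.choice(others)


def moving_path(length, p_back, rng):
    """Generate a moving path of length servers from a random start."""
    path = [rng.randrange(16)]
    previous = None
    while len(path) < length:
        nxt = next_server(path[-1], previous, p_back, rng)
        previous = path[-1]
        path.append(nxt)
    return path


rng = random.Random(0)  # fixed seed so that a run is repeatable
print(moving_path(length=10, p_back=0.1, rng=rng))
```

A larger Pback makes users oscillate between the same few servers, which produces the loops in moving sequences that the intra sequence count is designed to capture.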

4.3 Experiments for Utilizing Shared Data Allocation

We now examine the performance of SD and DF for
shared data allocation with the number of mobile users varied.
The number of movements by a mobile user is modelled by a
uniform distribution between 100 and 1,000. Fig. 4 shows the
local hit ratios of DF and SD. As can be seen from Fig. 5 and
Table 3, algorithm SD significantly outperforms DF. In the
situation where the difference between the number of
movements for frequent users and that for infrequent users is
not prominent, such as in this experiment, SD-global and
SD-local perform similarly. In order to investigate the difference
between SD-global and SD-local, we set the number of movements
of a frequently moving user to follow a uniform distribution
between 900 and 1,000, and the number of movements of an
infrequently moving user to follow a uniform distribution
between 100 and 150. Let Pfm denote the percentage of users
who move frequently. Fig. 3 shows the local hit ratios of
SD-global and SD-local with Pfm varied and Table 3 shows the
total hit counts of SD-global and SD-local with Pfm varied.
As shown in Table 4, when Pfm increases, the local hit ratio
of SD-global increases and that of SD-local decreases,
which results from the different replicated servers
employed by SD-local and SD-global. This also agrees with
our intuition in that, as mentioned before, SD-local will favour
users moving infrequently and SD-global is good for users
moving frequently. It is worth mentioning that although the hit
ratio of SD-local is larger than that of SD-global when Pfm is
less than 0.35, the total hit count of SD-global is larger than
that of SD-local, showing the difference between these two
optimization criteria described in Section 3.3. It is noted that
SD-global achieves the global optimization in that the total hit
ratio under SD-global is larger than that under SD-local, even
though the local hit ratios of some infrequent users under
SD-global are smaller than those under SD-local. Clearly, the
choice between SD-global and SD-local is a design issue that
depends on the system objective.

V. CONCLUSIONS 

In this paper, we presented a new data mining
algorithm which involves Data mining for user moving
patterns in an ad hoc computing environment for multichannel
data allocation, and utilized the mining results to
develop data allocation schemes so as to improve the overall
performance of a mobile system. First, we proposed
algorithms (i.e., algorithms MM and LM) to capture the
frequent user moving patterns from a set of log data in a
mobile environment. The algorithm proposed is enhanced
with the Data mining capability and is able to discover
new moving patterns efficiently without compromising the
quality of the results obtained. We then devised data
allocation schemes that can utilize the knowledge of user
moving patterns for proper allocation of both personal and
shared data. For personal data allocation, two data allocation
schemes, which involve different levels of mining results,
have been devised: one utilizes the set level of moving
patterns (i.e., scheme DS) and the other utilizes the path level
of moving patterns (i.e., scheme DP). As can be seen, the
former is useful for the allocation of read data objects,
whereas the latter is good for the allocation of update data
objects. The data allocation schemes for shared data, which
can achieve local optimization (i.e., SD-local) and global
optimization (i.e., SD-global), were also developed.
Sensitivity analysis on various parameters, including the
retrospective factor, was conducted and the performance of those
data allocation schemes was comparatively analysed. It was
shown by our simulation results that the knowledge
obtained from the user moving patterns is very important in
devising effective data allocation schemes. Specifically, for
personal data allocation, the local hit ratio of DS is
improved due to its knowledge of user moving patterns.
However, scheme DS incurs more replicated sites, thus
compromising the overall performance improvement achieved.
By utilizing the technique of prefetching, DP allocates data
more cost effectively to increase
the local data hit ratio. From the performance study of shared
data allocation, SD-local favours users moving
infrequently and SD-global is good for users moving
frequently. It is worth mentioning that the knowledge of user
moving patterns is useful in developing efficient location
management and querying strategies, and can lead to significant
performance improvement in a mobile computing system.